<html>
  <head>
    <title>Running Enzo on Specific Machines</title>
    <link href="../enzo.css" rel="stylesheet" type="text/css">
  </head>
<body> 
    <h1>Running Enzo on Specific Machines</h1>
    <p>This page contains information on running enzo on various NPACI machines and platforms.
      Though by no means exhaustive, it should provide enough useful links and scripts
      to help you get started.  More platforms and machines will be added as time permits
      and as they come online.</p>
    
    <P>&nbsp;</p>

    <h2>SDSC Blue Horizon (IBM SP/Power 3) - horizon.npaci.edu</h2>
    <p>General information concerning running simulations on Blue Horizon can be found
      <a href="http://www.npaci.edu/BlueHorizon/">here</a>.  An example batch script
      is here: <a href="bh/enzo.job">enzo.job</a>, which calls a script named
      <a href="bh/go_enzo">go_enzo</a>.  The files enzo.job and go_enzo should be 
      placed in the directory that enzo is going to execute in, along with all of the
      files needed for execution (such as enzo, ring, the initial conditions and
      parameter files, and any additional files).  Users should also read the SDSC documentation on
      <a href="http://www.npaci.edu/BlueHorizon/guide/runjobs.html#BATCH">Running LoadLeveler Batch Jobs</a>,
      which describes all of the relevant commands and script variables.  In particular, pay attention
      to the variables 'executable', 'node', 'tasks_per_node', 'wall_clock_limit' and 'class'.</p>
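
    <p>As a rough illustration of how those variables appear in a LoadLeveler job file, here is a minimal
      sketch.  It is not a copy of the linked enzo.job: the node count, tasks per node, time limit and class
      name are placeholder values chosen for illustration, and it assumes that the job file hands control
      to go_enzo through the 'executable' keyword.  Check the SDSC documentation for the classes and limits
      that actually apply to your allocation.</p>

<pre>
#!/bin/sh
# Hypothetical LoadLeveler job file; every value below is a placeholder.
# @ job_type         = parallel
# @ executable       = ./go_enzo
# @ node             = 4
# @ tasks_per_node   = 8
# @ wall_clock_limit = 01:00:00
# @ class            = normal
# @ queue
</pre>

    <p>With these placeholder values the job would request node x tasks_per_node = 32 tasks in total.</p>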

    <p>The batch job is submitted by typing:</p>

    <p><tt>llsubmit enzo.job</tt></p>

    <p>The status of a batch job can be checked by using the command <tt>llq</tt>, and 
      more detailed information about job execution and time-until-execution can be
      found by typing <tt>showq --full</tt>.</p>

    <p>It is advisable to run all enzo jobs in a directory on GPFS.  This allows enzo to
      take advantage of the parallel filesystem, and there is a great deal of disk space
      available on gpfs.  Your gpfs directory can be reached by typing <tt>cd /gpfs/USERNAME</tt>,
      where USERNAME is your login.  You may have to create this directory yourself.  Data can be
      backed up using <a href="http://www.npaci.edu/SRB/guide/getstart.html">SRB</a> 
      or <a href="http://www.npaci.edu/BlueHorizon/guide/getstart.html#ARCHIVAL">HPSS</a>.  Note
      that gpfs has a very liberal purge policy - files can typically stay on gpfs for at least a week before
      they are purged.  Therefore, the scripts do not back up data while the job is
      executing - that is left to the user to do afterward.  Both SRB and HPSS handle a few very
      large files better than lots of tiny files, so you should tar up each entire
      data dump into a single large tarfile before putting it into mass storage.  You may find
      useful material on the Enzo cookbook page for
      <a href="dataissues_top.html">data handling issues and useful scripts</a>.</p>
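
    <p>A minimal sketch of bundling one data dump before archiving it is shown below.  It assumes that all
      files belonging to the dump share the hypothetical prefix <tt>dump0001</tt>; substitute the names your
      run actually produces.  The second command only lists the archive contents, as a quick check
      before the tarfile is sent to mass storage.</p>

<pre>
# "dump0001" is a hypothetical file prefix, not a name enzo is guaranteed to use
tar -cf archive_dump0001.tar dump0001*

# list the contents of the tarfile to confirm that every file was included
tar -tf archive_dump0001.tar
</pre>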

    <h2>SDSC DataStar (IBM Regatta/Power 4) - ds002.sdsc.edu</h2>
    <p>DataStar is the new IBM system (replacing Blue Horizon in roughly April 2004).  The
      operating environment is very similar to Blue Horizon, including the gpfs filesystem and
      the various archival options.  LoadLeveler commands are basically the same, though there 
      are some differences in the scripts (download them here: <a href="ds/enzo.job">enzo.job</a> 
      and <a href="ds/go_enzo">go_enzo</a>).  Note that binaries compiled on Blue Horizon can
      execute on DataStar (and vice versa).  The two machines share an archival system and a parallel
      file system, and the same notes and warnings about purges and backups apply.  You may find
      useful material on the Enzo cookbook page for
      <a href="dataissues_top.html">data handling issues and useful scripts</a>.</p>

    <h2>NCSA Copper (IBM Regatta/Power 4) - cu.ncsa.uiuc.edu</h2>
    <p>General information on Copper is available 
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/">here</a>, 
      with user documentation <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/Doc/">here</a>.
      The operating environment on Copper (cu.ncsa.uiuc.edu) is very similar to Blue Horizon and DataStar, with
      some modifications.  Information on running jobs on Copper can be found
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/Doc/Jobs.html">here</a>,
      and an example <a href="cu/copper_example.ll">batch script</a> is available.</p>

    <p>
      Perhaps the most significant difference between the NCSA and SDSC IBM machines concerns their use of scratch
      space.  While SDSC has a very liberal purge policy, the NCSA is much more stringent.  Files must be moved
      to a scratch directory by the job script, and there must be some mechanism within the script to back up
      simulation data to the NCSA archival system, 
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/UniTree/">UniTree</a>.  We have an example of
      such a mechanism, a script called <a href="cu/squirrel">squirrel</a>, which runs in the background of the job script
      while enzo is computing and takes care of backing up data to UniTree.
    </p>

    <h2>NCSA Titan (IA-64 Linux cluster) - titan.ncsa.uiuc.edu</h2>
    <p>
      General information on Titan is available
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IA64LinuxCluster/">here</a>, with user documentation
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IA64LinuxCluster/Doc/">here</a>.  This machine runs
      Linux and uses the Portable Batch System (PBS) and the Maui Scheduler for running jobs.  See the NCSA page on 
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IA64LinuxCluster/Doc/Jobs.html">Running Jobs On Titan</a>
      and in particular the section on 
      <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IA64LinuxCluster/Doc/Jobs.html#Commands">batch commands</a> 
      for more information.  An example script for running enzo is available <a href="titan/titan_example_script.batch">here</a>.
      Intriguingly enough, Titan does not currently suffer from the same draconian purge policy as Copper, so it is straightforward
      for users to back up their own data after the simulation is completed.  As with Copper, all data should be backed up
      to <a href="http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/UniTree/">UniTree</a>.  You may find
      useful material on the Enzo cookbook page for
      <a href="dataissues_top.html">data handling issues and useful scripts</a>.
    </p>
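
    <p>For orientation, the general shape of a PBS job script is sketched below.  This is not the linked
      Titan example script: the job name, node and processor counts, and time limit are placeholder values,
      and the command that actually launches enzo in parallel is site-specific, so it is deliberately
      left out here.  Take it from the linked example script.</p>

<pre>
#!/bin/sh
# Hypothetical PBS job script; every value below is a placeholder.
#PBS -N enzo_run
#PBS -l nodes=4:ppn=2
#PBS -l walltime=01:00:00

# PBS starts the job in the home directory; move to the directory the job was submitted from
cd $PBS_O_WORKDIR

# the site-specific parallel launch command for enzo goes here
</pre>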

    <h2>PSC Lemieux (Compaq Alpha) - lemieux.psc.edu</h2>

    <p>Note - this section was written by 
      <A href="mailto:dcollins@physics.ucsd.edu">David Collins</a>.  Please
      direct questions concerning specifics of running jobs on Lemieux to him.</p>


    <p>
      <a href="http://www.psc.edu/machines/tcs/lemieux.html"> General Information.</a> <br>
      <a href="http://www.psc.edu/machines/tcs/lemieux.html#running"> Running jobs. </a><br>
      <a href="http://www.psc.edu/machines/tcs/lemieux.html#qsub">A sample batch script can be found here. </a><br>
      Be patient while loading the previous two links, as the page is fairly
      long and the relevant information is somewhere in the middle.</p>

    <p>
      Lemieux consists of 750 Compaq Alphaserver ES45 nodes and two separate front
      end nodes. Each computational node is a 4-processor SMP with four 1-GHz processors
      and 4 GB of memory shared among them, and runs the Tru64 Unix operating system.
      The nodes are connected by a Quadrics interconnect.
      </p>

    <P>
      Each compute node has 38 Gbytes of local temporary disk space available to you as 
      <tt>$LOCAL</tt>. The <tt>$LOCAL</tt> space for each node is distinct and is 
      only visible to its node.</p>

    <P>
      The local node disk is only accessible from batch scripts or running jobs.
      Access is controlled by the 
      <a href="http://www.psc.edu/machines/tcs/lemieux.html#fileIO">tcsio routines</a>.
      Local disks are cleaned off immediately at the end of a run.</p>

    <P>Lemieux's archive system is called <a href="http://www.psc.edu/general/filesys/far/far.html">Golem</a>.
      It can be accessed with the <a href="http://www.psc.edu/machines/tcs/lemieux.html#far">far</a> 
      interface.</p>

    <p>To run a job, type <tt>qsub MyScript.sh</tt>.  To monitor and manipulate jobs in the queue,
      <a href="http://www.psc.edu/machines/tcs/lemieux.html#monitor">use these commands</a>.  Here is
      <A href="Lemieux/Myscript.sh">a sample batch script</a>.</p>
 
    <p>&nbsp;</p>
    <p>
      <a href="index.html">Previous - Index</a><br>
      <a href="dataissues_top.html">Next - Data Issues</a><br>
</p>


<p>&nbsp;</p>
<p>
<a href="../index.html">Go to the Enzo home page</a>
</p>
<hr WIDTH="100%">
<center>&copy; 2004 &nbsp; <a href="http://cosmos.ucsd.edu">Laboratory for Computational Astrophysics</a><br></center>
<center>last modified February 2004<br>
by <a href="mailto:bwoshea (AT) lanl.gov">B.W. O'Shea</a></center>


  </body>
</html>
